Parallel Processing of Multi-Join Expansion_aggregate Data Cube Query in High Performance Database Systems
Authors
Abstract
Data cube queries containing aggregate functions often combine multiple tables through join operations. We can extend this to “Multi-Join Expansion_Aggregate” data cube queries by using more than one aggregate function in the “SELECT” statement in conjunction with relational operators. In parallel processing of such queries, it must be decided which attribute to use as the partitioning attribute: the join attribute or cube-by. Based on the partitioning attribute, we introduce three parallel multi-join expansion_aggregate data cube query methods, namely the Multi-join Partition Method (MPM), the Expansion Partition Method (EPM), and the Early Expansion Partition with Replication Method (EPRM). All three methods use the join attribute and cube-by as the partitioning attribute. Performance evaluation of the three parallel processing methods is also carried out and...
Similar resources
On the Performance of Parallel Join Processing in Shared Nothing Database Systems
Parallel database systems aim at providing high throughput for OLTP transactions as well as short response times for complex and data-intensive queries. Shared nothing systems represent the major architecture for parallel database processing. While the performance of such systems has been extensively analyzed in the past, the corresponding studies have made a number of best-case assumptions. In...
Parallel data cube construction for high performance on-line analytical processing
Decision support systems use On-Line Analytical Processing (OLAP) to analyze data by posing complex queries that require different views of data. Traditionally, a relational approach (ROLAP) has been taken to build such systems. More recently, multi-dimensional database techniques (MOLAP) have been applied to decision-support applications. Data is stored in multi-dimensional arrays which is a n...
Revisiting the Role of Pipelined Parallelism in Multi-Join Query Processing
Multi-join queries are the core of any integration service that integrates data from multiple distributed data sources. Due to the large number of data sources and possibly high volumes of data, the evaluation of multi-join queries faces increasing scalability concerns. Parallel processing has been applied to tackle this problem. State-of-the-art parallel multi-join query processing commonly as...
High Performance Data Mining Using Data Cubes on Parallel Computers
On-Line Analytical Processing techniques are used for data analysis and decision support systems. The multidimensionality of the underlying data is well represented by multidimensional databases. For data mining in knowledge discovery, OLAP calculations can be effectively used. For these, high performance parallel systems are required to provide interactive analysis. Precomputed aggregate calcu...
Parallel Processing of JOIN Queries in OGSA-DAI
The JOIN query is the most important and often the most expensive of all relational operations, especially when its input comes from tables of considerable size on distributed heterogeneous databases. As parallel join processing is a well understood technique to get results as quickly as possible, one way to speed up query execution is to exploit parallelism. Since most real queries involve joins ...